Principal Direction Linear Oracle for Gene Expression Ensemble Classification

Authors

  • Leif E. Peterson
  • Matthew A. Coleman
Abstract

A principal direction linear oracle (PDLO) ensemble classifier for DNA microarray gene expression data is proposed. The common fusion-selection ensemble based on weighted trust for a specific classifier was replaced with pairs of subclassifiers of the same type, using PDLO to perform a linear hyperplane split of training and testing samples. The hyperplane split forming the oracle was based on rotations of principal components extracted from sets of filtered features in order to maximize the separation of samples between the pair of miniclassifiers. Eleven classifiers were evaluated for performance with and without PDLO implementation: k-nearest neighbor (kNN), naïve Bayes classifier (NBC), linear discriminant analysis (LDA), learning vector quantization (LVQ1), polytomous logistic regression (PLOG), artificial neural networks (ANN), constricted particle swarm optimization (CPSO), kernel regression (KREG), radial basis function networks (RBFN), gradient descent support vector machines (SVMGD), and least squares support vector machines (SVMLS). PLOG resulted in the best performance when used as a base classifier for PDLO. The greatest performance for PLOG implemented with PDLO occurred for tenfold CV and 100 rotations of PC scores with fixed angles for hyperplane splits. Random rotation angles for hyperplane splits resulted in reduced performance when compared to rotations with fixed angles.
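
The oracle idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: only the first two principal components are used, "fixed angles" is read as equally spaced angles in [0, π), the hyperplane passes through the origin of the PC scores, ensemble members are combined by majority vote, and a simple nearest-centroid rule stands in for the base classifier (it is not one of the eleven evaluated). All function and class names are hypothetical.

```python
import numpy as np

def fit_pca2(X):
    """Return the feature mean and the top two principal directions of X."""
    mu = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:2].T  # shape (n_features, 2)

def oracle_side(X, mu, W, theta):
    """Linear oracle: assign each sample to side 0 or 1 of a rotated hyperplane."""
    scores = (X - mu) @ W                               # PC scores, (n_samples, 2)
    normal = np.array([np.cos(theta), np.sin(theta)])   # hyperplane normal (assumed form)
    return (scores @ normal >= 0).astype(int)

class NearestCentroid:
    """Stand-in base classifier (assumption; the paper reports PLOG as best)."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

def fit_pdlo(X, y, n_rotations=10):
    """Train one pair of subclassifiers per rotation angle."""
    mu, W = fit_pca2(X)
    members = []
    for theta in np.linspace(0.0, np.pi, n_rotations, endpoint=False):
        side = oracle_side(X, mu, W, theta)
        pair = []
        for s in (0, 1):
            mask = side == s
            if not mask.any():
                # Degenerate split: fall back to all training samples.
                mask = np.ones(len(y), dtype=bool)
            pair.append(NearestCentroid().fit(X[mask], y[mask]))
        members.append((theta, pair))
    return mu, W, members

def predict_pdlo(model, X):
    """Route each sample through every oracle, then take a majority vote."""
    mu, W, members = model
    votes = []
    for theta, pair in members:
        side = oracle_side(X, mu, W, theta)
        votes.append(np.where(side == 0, pair[0].predict(X), pair[1].predict(X)))
    votes = np.array(votes)  # shape (n_rotations, n_samples)
    out = []
    for col in votes.T:
        vals, counts = np.unique(col, return_counts=True)
        out.append(vals[counts.argmax()])
    return np.array(out)
```

Calling `fit_pdlo(X_train, y_train, n_rotations=100)` followed by `predict_pdlo(model, X_test)` would mirror the 100-rotation setting mentioned in the abstract. Replacing `np.linspace` with randomly drawn angles would give the random-rotation variant that the abstract reports as performing worse.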

Similar resources

Classifier Ensemble Framework: a Diversity Based Approach

Pattern recognition systems are widely used in a host of different fields. Because there is no known method for identifying the best classifier for an arbitrary problem, and because ensembles significantly improve accuracy, researchers turn to ensemble methods in almost every pattern recognition task. Classification as a major task in pattern recognition,...

A Pre-Trained Ensemble Model for Breast Cancer Grade Detection Based on Small Datasets

Background and Purpose: Breast cancer is reported as one of the most common cancers among women. Early detection of the cancer type is essential for informing subsequent treatment. The newest proposed breast cancer detectors are based on deep learning. Most of these works focus on large datasets and are not developed for small datasets. Although the large datasets might lead ...

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression data provide a wealth of information spanning thousands of variables (genes). Analyzing this information can reveal the genetic causes of disease and the differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method to select valuable, high-impact genes as biomarkers, and then classify ovarian tumors based on gene expression data of...

Optimum Ensemble Classification for Fully Polarimetric SAR Data Using Global-Local Classification Approach

In this paper, a proposed ensemble classification for fully polarimetric synthetic aperture radar (PolSAR) data using a global-local classification approach is presented. In the first step, to perform the global classification, the training feature space is divided into a specified number of clusters. In the next step to carry out the local classification over each of these clusters, which cont...

Principal Component Analysis in Very High-dimensional Spaces

Principal component analysis (PCA) is widely used as a means of dimension reduction for high-dimensional data analysis. A main disadvantage of the standard PCA is that the principal components are typically linear combinations of all variables, which makes the results difficult to interpret. Applying the standard PCA also fails to yield consistent estimators of the loading vectors in very high-...

Publication date: 2007